-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minor code clean-ups and performance improvements #906
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Thorium, this is quite insightful.
It sparks some interest to detect these things via analyzers.
Before I dwindle in that territory, did you detect these manually? Or use some tooling?
Last question: is any of this noticeable? We don't have a benchmark so I'm not asking you to add that but did you run this before and after? Just curious.
@@ -15,8 +15,12 @@ open FSharp.Formatting.Markdown | |||
|
|||
module String = | |||
/// Matches when a string is a whitespace or null | |||
[<return: Struct>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any idea under which conditions you would start using this?
- What is the minimum language version?
- Always for unit or can this be used for other return types as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Struct constructs came with FSharp.Core 6.0.1 so that's the minimum version.
Works with other return types too.
|
||
module List = | ||
/// Matches a list if it starts with a sub-list that is delimited | ||
/// using the specified delimiters. Returns a wrapped list and the rest. | ||
let inline (|DelimitedWith|_|) startl endl input = | ||
let inline internal (|DelimitedWith|_|) startl endl input = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So internal
because this is used in other files in the project?
Otherwise private
would also work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just want to limit the scope to this solution: As you know inline makes compiler slower. So if you have hundreds of public inline functions in your solution and someone refers your project via Nuget package, they are going to pay the price with their compilation too, because those inline-functions come available for them too, even when they probably are not aware of them and probably will never need them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my reasoning only, and I might be wrong:
If do code like this: myInt |>
and now you ask your F# IDE to auto-complete, then when it knows the type of "myInt" then it can give proper function options: Those who take int in this case.
But in case of inline functions, they use statically resolved type parameters.
So if you have a referenced 10s other nuget packages and all have 10s of inline functions, then... all of those might be potentials for your auto-complete. You probably don't want call those, so it's just noise that could be avoided if they'd be internal functions.
|
||
open ArgParser | ||
|
||
[<Struct>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The heuristic would be here to add [<Struct>]
when the DU only has cases without any fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, Enums and similar are struct per default, so if you do this:
type MyE =
| A = 1
| B = 2
it's fine as is. But if you do this:
type MyE =
| A
| B
...it's not a struct. So basically it'll generate a class of MyE containing a class of A and class of B and an instance of either.
FSharpLint can detect the null-check, but it's a problematic project because it seems to break on each new major .NET release (which are often) and it randomly crashes on many other cases. The others were manual. |
General guidelines about structs (and I don't have good references to point here): Keep your structs small. By small I mean a bool or guid or int or int64 or another struct, but not bunch of other classes or interface-implementations of any kind or big lists or anything like that. Struct is read to memory at once and kept there, so it's not a pointer to somewhere else, and it makes the processing fast and footprint small. The limitation is function parameters, structs are copied and not sent as pointers, which means if you have big structs, then function calls go slow. Which you can avoid by making your structs as ByRef and so on, but then they are mutable and it goes as hack-hack rather than proper enterprise code. But that's why e.g. System.Text.Json is so fast, they use big structs byref and trust to take 100% out of your single core. If you have a simple object having a few non-class fields, then it's better to be struct. If you have a 1000s of these small objects it will really improve the performance and memory footprint when processing them through. Option class is not a struct, but there is a struct-corresponding ValueOption. ValueOption has IsSome and IsNone and typically the usage with match-clause will just work. And there is ValueOption.bind and so-on like Option.bind you've used to. But if you use lot of existing F# functions like List.tryPick they work with Some x and not ValueSome x, and there I'd go for usability over performance. Tuple class is not a struct but, but there is a ValueTuple which is used instead of So this is still good: [<Struct>]
type MyType =
| MyOp of int
| MyOp2 of bool ValueOption
| MyOp3 of struct(bool*bool) They also say you should performance-test, to verify your changes, because many times it depends on use-case if a harmless looking So for conservative analysis, I'd go for only non-data-carrying and/or only plain boolean/int/guid/struct-field types to be recommended to be structs. |
Thanks for all the insight!
|
@@ -464,26 +463,29 @@ module internal ArgParser = | |||
else | |||
None | |||
|
|||
[<return: Struct>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nojaf here is an example where there is a Struct boolean. As boolean is a tiny field, there is no point of wrapping it to inside a separate class, like the original code did. Wrapping just makes the processing of comparisons etc slower.
The comparisons are probably microseconds, so I expect it's not human noticeable in a single run of generating the docs, but if you have build-pipeline over every commit queuing generating docs, then even small changes can affect over time, to free up some CI workers faster. |
Sure, that is fair. I just wanted you to admit that you didn't check 😉. |
Here is quite detailed reference post of Structs: https://www.bartoszsypytkowski.com/writing-high-performance-f-code/ |
Code clean-up with minor performance optimizations:
No functional changes.